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1 Abstract 

2 

3 Land surface (or "skin") temperature (LST) lies at the heart of the surface energy balance and is a 

4 key variable in weather and climate models. Here we assimilate LST retrievals from the 

5 International Satellite Cloud Climatology Project (ISCCP) into the Noah and Catchment (CLSM) 

6 land surface models using an ensemble-based, off-line land data assimilation system. LST is 

7 described very differently in the two models. A priori scaling and dynamic bias estimation 

8 approaches are applied because satellite and model LST typically exhibit different mean values 

9 and variability. Performance is measured against 27 months of in situ measurements from the 

10 Coordinated Energy and Water Cycle Observations Project at 48 stations. LST estimates from 

1 1 Noah and CLSM without data assimilation ("open loop") are comparable to each other and 

12 superior to that of ISCCP retrievals. For LST, RMSE values are 4.9 K (CLSM), 5.6 K (Noah), 

13 and 7.6 K (ISCCP), and anomaly correlation coefficients (R) are 0.62 (CLSM), 0.61 (Noah), and 

14 0.52 (ISCCP). Assimilation of ISCCP retrievals provides modest yet statistically significant 

15 improvements (over open loop) of up to 0.7 K in RMSE and 0.05 in anomaly R. The skill of 

16 surface turbulent flux estimates from the assimilation integrations is essentially identical to the 

17 corresponding open loop skill. Noah assimilation estimates of ground heat flux, however, can be 

18 significantly worse than open loop estimates. Provided the assimilation system is properly 

19 adapted to each land model, the benefits from the assimilation of LST retrievals are comparable 

20 for both models. 

21 
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22 1 . Introduction 

23 Land surface conditions are intimately connected with the global climate system and have been 

24 associated, through different pathways, with atmospheric predictability. Land surface 

25 temperature (LST) lies at the heart of the surface energy balance and is therefore a key variable 

26 in weather and climate models. LST influences the latent and sensible heat fluxes to the 

27 atmosphere through which it affects the planetary boundary layer and atmospheric convection. 

28 LST also plays an important role in the assimilation of atmospheric remote sensing observations 

29 because atmospheric retrieval algorithms (or forward radiative transfer modeling) for surface- 

30 sensitive (window) channels require information about land surface conditions. Accurate LST 

3 1 specification is therefore critical to improving estimates of the surface water, energy, and 

32 radiation balance as well as atmospheric quantities, which in turn are all critical to improving 

33 weather and climate forecast accuracy. 

34 

35 Satellite retrievals of LST (also referred to as "skin temperature") are available from a variety of 

36 polar orbiting and geostationary platfonns carrying infrared and microwave sensors (Aires et al. 

37 2004, Jin 2004, Minnis and Khaiyer 2000, Pinheiro et al. 2004, Rossow and Schiffer 1991, 1999, 

38 Trigo and Viterbo 2003, Wan and Li 1997). Land surface models (driven by observed 

39 meteorological forcing data or coupled to an atmospheric model) offer estimates of global land 

40 surface conditions, including LST. Errors in the forcing fields, however, along with the 

41 imperfect parameterization of land-atmosphere interactions can lead to considerable drifts in 

42 modeled land surface states. Land data assimilation systems combine the complementary 

43 infonnation from modeled and observed land surface fields and produce dynamically consistent, 

44 spatially complete and temporally continuous estimates of global land surface conditions. 
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45 Assimilating LST retrievals into a land surface model should, in concept, improve model 

46 estimates of land surface conditions. 

47 

48 There has been considerable progress in the methodological development and application of land 

49 data assimilation algorithms (Andreadis and Lettenmaier 2005, Balsamo et al. 2007, Crow and 

50 Wood 2003, Drusch 2007, Dunne and Entekhabi 2006, Mahfouf et al. 2009, Margulis et al. 2002, 

5 1 Pan and Wood 2006, Reichle et al. 2009, Seuffert et al. 2003, Slater and Clark 2006, Walker et 

52 al. 2001, Zaitchik and Rodell 2009, Zhou et al. 2006), with ensemble-based Kalman filtering and 

53 smoothing algorithms emerging as a common and promising method for land data assimilation. 

54 Development and applications of land data assimilation, however, have largely focused on 

55 assimilating observations of surface soil moisture, snow cover, and snow water equivalent, with 

56 less effort devoted to LST assimilation. 

57 

58 The goal of this study is to investigate the potential for assimilating satellite retrievals of LST 

59 within a state-of-the-art land surface data assimilation system. Specifically, LST retrievals from 

60 the International Satellite Cloud Climatology Project (ISCCP) are assimilated into the NASA 

61 Catchment land surface model (CLSM; Koster et al. 2000) and into the Noah land surface model 

62 (Ek et al. 2003) with the ensemble Kalman filter (EnKF) developed at the NASA Global 

63 Modeling and Assimilation Office (Reichle et al. 2009). Lor validation of the assimilation 

64 products we use in situ observations from the Coordinated Energy and Water Cycle Observations 

65 Project (CEOP). We pay particular attention to bias between observed and modeled LST and 

66 have fitted the EnKF with several bias estimation algorithms designed specifically to address 

67 LST biases. It will be shown that the assimilation algorithm must be customized for the model- 

68 specific representation of LST. 
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69 2. Background 

70 LST can be retrieved from infrared and microwave sensors on geostationary and polar-orbiting 

7 1 platforms, including the NOAA Geostationary and Polar-orbiting Operational Environmental 

72 Satellite series dating back to the early 1980’s. Given the extensive global and multi-decadal 

73 record of satellite-based LST retrievals, and given the importance of accurate LST estimation in 

74 particular for global atmospheric data assimilation systems, it is telling that the challenge of 

75 operational LST assimilation has been largely unmet. 

76 

77 The difficulties of LST data assimilation are rooted in the nature of LST retrievals and modeling. 

78 LST data from retrievals and land surface models typically exhibit strong biases that depend on 

79 observation time and location and that have been well documented (see, for instance, Jin et al. 

80 1997, Trigo and Viterbo 2003, Jin 2004; see also section 6 for examples). Biases arise for a 

8 1 variety of reasons. For instance, LST modeling is fraught with numerical stability problems 

82 because in nature the effective heat capacity associated with LST is very small. Land modelers 

83 are thus forced to approximate the corresponding heat capacity as zero or to use a surface 

84 temperature prognostic variable that represents more than just a very thin layer. The first 

85 approach, used for example in Noah, derives LST as a diagnostic variable from the surface 

86 energy balance. The second approach, used for example in CLSM, lumps the vegetation and top 

87 few centimeters of soil matter into a single model prognostic variable with a small but non-zero 

88 heat capacity. The latter approach is obviously at odds with satellite retrievals of LST, which 

89 describe the temperature in a much shallower layer at the land-atmosphere interface (vegetation 

90 or soil, as viewed from the satellite sensor). On the other hand, the zero-heat capacity approach 

91 requires an additional connection between the diagnostic model LST variable and a model 
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92 prognostic variable to which data assimilation increments can be applied (so that they can alter 

93 the model forecast). Additional discrepancies between the LST observed by the satellite and that 

94 computed by the land model stem from the inability of global land models to resolve the land 

95 surface at the same fine horizontal resolution as infrared satellite sensors. 

96 

97 Satellite-based LST retrievals suffer from their own set of disadvantages and problems. Infrared 

98 LST retrievals are only available under clear-sky conditions and are notoriously prone to cloud 

99 contamination (Jin 2004). Microwave LST retrievals are available under cloudy conditions 

100 (Aires et al. 2004) but depend on uncertain estimates of microwave land surface emissivity. 

101 Both infrared and microwave LST retrievals depend on knowledge of atmospheric conditions 

102 above the LST footprint. LST retrievals also depend on the look-angle, which is particularly 

103 important for retrievals at high viewing angles (Minnis and Khaiyer 2000, Pinheiro et al. 2004). 

104 The benefit of having a great variety and long record of different platforms from which LST can 

105 be retrieved is partly negated by the corresponding variety of sensor characteristics and sensor- 

106 specific LST retrieval algorithms that make it difficult to achieve a homogeneous satellite LST 

107 record. 

108 

109 Additional complications arise when model and satellite LST are combined in a data assimilation 

110 system. The strong seasonal and diurnal cycles of LST must be considered because error 

111 characteristics may depend on time-of-day and season. Moreover, obvious problems result when 

112 (clear-sky) LST retrievals are assimilated into a model at a time and location for which the model 

113 state or forcing indicate cloudy conditions. Because the infrared and microwave emissivities of 

114 the land surface are not well known, it is difficult to compare the radiometric temperature 
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115 observed by the satellite with the physical temperature of the land model. Last but not least, the 

116 dearth of validating in situ observations of LST and land surface fluxes is a severe impediment to 

117 the validation of data products from satellite observations, modeling, and data assimilation. 

118 

119 One major development path is to assimilate LST retrievals into a simple representation of the 

120 land surface energy balance using an adjoint-based variational assimilation approach (Castelli et 

121 al. 1999; Boni et al. 2001; Caparrini et al. 2004; Sini et al. 2008). This elegant method requires 

122 only a minimal amount of ancillary data and provides robust estimates of evaporative fraction. It 

123 is not, however, easily applicable to existing global atmospheric or land data assimilation 

124 systems because it is very difficult to develop and maintain adjoint models for the complex land 

125 surface model components in such systems. Recently, Meng et al. (2009) developed the adjoint 

126 model of just the surface energy balance component of the Common Land Model. Using the 

127 variational method, the authors assimilated in situ LST observations from four AmeriFlux sites 

128 for up to 20 days and report improvements in evapotranspiration estimates when verified against 

129 coincident in situ observations. 

130 

131 Other off-line surface temperature assimilation studies used filtering techniques. Kumar and 

132 Kaleita (2003) used the Extended Kalman filter to assimilate in situ observations of surface soil 

133 temperature from a single site for one month into a soil heat transfer model based on the 

134 discretized diffusion equation. Lakshmi (2000) merged one year of satellite retrievals of LST 

135 over the Red-Arkansas basin into a simple two-layer model of the surface water and energy 

136 balance. 

137 
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138 A few attempts have been made to adjust terms in the surface energy balance of atmospheric 

139 models in response to satellite LST retrievals. McNider et al. (1994) describe a technique to 

140 assimilate satellite LST into the surface energy budget of a regional-scale atmospheric model. 

141 Their method assumes that surface soil moisture is the least known parameter in the early- 

142 morning surface energy budget. Van den Hurk et al. (2002) assimilated satellite LST and near- 

143 surface relative humidity measurements into a regional weather forecast model. By adjusting the 

144 root zone soil moisture and the roughness length for heat the authors find small improvements in 

145 the surface energy balance. Garand (2003) outlines a variational method for a unified land and 

146 ocean surface skin temperature analysis, including a linear a priori bias correction of the 

147 assimilated radiances. 

148 

149 Bosilovich et al. (2007) developed an algorithm for LST assimilation into a global model that 

150 introduces an incremental bias correction term into the model’s surface energy budget. In 

151 contrast to the McNider et al. (1994) approach, all temperature-dependent tenns in the surface 

152 energy budget respond directly to the LST retrievals. In its simplest form, the Bosilovich et al. 

153 (2007) algorithm estimates and corrects a constant time mean bias for each grid point; additional 

154 benefits are attained with a refined version of the algorithm that allows for a correction of the 

155 mean diurnal cycle. The results of Bosilovich et al. (2007) indicate that LST assimilation 

156 improves estimates of 2m air temperature, both in mean and variability, in a coupled land- 

157 atmosphere model. Neglecting the diurnal cycle of the LST bias causes degradation of the 

158 diurnal amplitude of background model air temperature in many regions. In situ measurements 

159 of energy fluxes at several locations were used to inspect the surface energy budget more closely. 

160 LST assimilation generally improves the sensible heat flux and, in some cases, it improves the 
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161 Bowen ratio. At many stations, however, LST assimilation increases slightly the bias in the 

162 monthly latent heat flux. A critical limitation of the method of Bosilovich et al. (2007) is the 

163 assumption of unbiased LST retrievals. 

164 

165 In this paper, we restrict ourselves to uncoupled land data assimilation and test several variants 

166 of the bias estimation strategy of Bosilovich et al. (2007). We also explore an alternative 

167 strategy of scaling the LST retrievals into the climatology of the modeled LST. As we will show 

168 in section 6, not scaling the LST retrievals prior to (uncoupled land) data assimilation can create 

169 serious imbalances in the model-generated mass and energy fluxes and can lead to entirely 

170 unrealistic land surface fluxes. We test these approaches with two land surface models that 

171 represent LST very differently: CLSM and Noah (section 4). Our results are directly linked to 

172 weather and climate prediction applications because these two land models are used in the 

173 atmospheric data assimilation systems of the NASA Global Modeling and Assimilation Office 

174 (GMAO) and the NOAA National Centers for Environmental Prediction (NCEP), respectively. 

175 
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176 3. Data 

177 In this study, we assimilate LST retrievals from ISCCP (http://isccp.giss.nasa.gov; Rossow and 

178 Schiffer 1991, 1999). The ISCCP archive contains satellite-based estimates of global cloud 

179 cover and radiative properties from 1983 through the present (recent data are added with a 

180 latency of about 1 year), and is based on observations from an international network of 

181 meteorological satellites. Specifically, the ISCCP 30 km Pixel Level Cloud Product (DX) 

182 includes global, 3-hourly, clear-sky LST retrievals from infrared radiances. For this study, we 

183 extracted LST retrievals from the DX archive for the geostationary platforms and aggregated the 

184 data to a global latitude-longitude grid with 1 degree resolution for assimilation into our system. 

185 

186 The availability of validating land surface temperature and flux data is very limited. In this 

187 study, we use the comparably large collection of such data provided by CEOP 

188 (http://www.ceop.net) to validate the data assimilation products. Specifically, we obtained 

189 hourly data from the network of CEOP Reference Sites from 1 October 2002 through 3 1 

190 December 2004 (Figure 1, Table 1). Sufficient data for validation are available at 48 distinct 

191 sites, of which 19 sites have LST data, 30 have latent heat (LH) and sensible heat (SH) flux data, 

192 and 20 sites have ground heat (GH) flux data. Only 4 stations have LST as well as LH, SH, and 

193 GH observations sufficient for validation (Cabauw, Bondville, Lindenberg Falkenberg, and 

194 Lindenberg Forest). The hourly CEOP data were aggregated to 3-hourly averages for 

195 comparison with the 3-hourly retrieval, model, and assimilation products. 

196 

197 The surface meteorological forcing data for the two land models are from the Global Land Data 

198 Assimilation Systems (GLDAS) project (Rodell et al. 2003; http://ldas.gsfc.nasa.gov) and were 
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199 provided at 3-hourly time steps and at 2° and 2.5° resolution in latitude and longitude, 

200 respectively. The GLDAS data stream is based on output from the global atmospheric data 

201 assimilation system at the NASA Global Modeling and Assimilation Office (Bloom et al. 2005) 

202 adjusted with pentad precipitation observations from the Climate Prediction Center Merged 

203 Analysis of Precipitation (CMAP; http://www.cdc.noaa.gov/cdc/data.cmap.html) and daily 

204 estimates of surface radiation from the Air Force Weather Agency (AFWA) Agricultural 

205 Meteorology (AGRMET) system. The observation-based corrections ensure that the forcing data 

206 and hence the land model output are as close to reality as is possible (without the benefit of 

207 assimilating the LST retrievals). 

208 
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209 4. Data assimilation system 

210 4.a. The Catchment and Noah land surface models 

211 Modeled LST and land surface fluxes are from integrations of CLSM (Koster et al. 2000) and 

212 Noah (Ek et al. 2003). Again, these two models are the land model components of the 

213 atmospheric data assimilation and forecasting systems at NASA GMAO and NOAA NCEP, 

214 respectively. Both models dynamically predict land surface water and energy fluxes in response 

215 to surface meteorological forcing but follow markedly different approaches to modeling soil 

216 moisture and LST. 

217 

218 CLSM’s basic computational unit is the hydrological catchment (or watershed). The global land 

219 surface is divided into catchments (excluding inland water and ice-covered areas) with a mean 

220 linear scale of around 50 km (ranging from a few km to 250 km). Unlike traditional, layer-based 

22 1 models, CLSM includes an explicit treatment of the spatial variation of soil water and water table 

222 depth within each hydrological catchment based on the statistics of the catchment topography. 

223 The surface energy balance is computed separately for the (dynamically varying) saturated, 

224 transpiring, and wilting sub-areas of each catchment. In each of these three sub-areas, the bulk 

225 temperature of the vegetation canopy and the top 5 cm of the soil column is modeled with a 

226 "surface temperature" (TSURF) prognostic variable that is specific to the soil moisture regime. 

227 The three TSURF prognostic variables interact with an underlying heat diffusion model for soil 

228 temperature (consisting of six layers with depths equal to about 10, 20, 40, 75, 150, 1000 cm 

229 from top to bottom) that is common to the three sub-areas (see Figure 5 of Koster et al. 2000). In 

230 the absence of snow, the area-weighted average of the three prognostic TSURF variables 

23 1 (hereinafter also referred to as the "surface temperature" in CLSM) is the most appropriate 
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232 quantity to compare to satellite-based LST retrievals (Figure 2). CLSM integrations were 

233 conducted using the GMAO land data assimilation system (Reichle et al. 2009) with a model 

234 time step of 20 minutes. 

235 

236 Noah is a more traditional, layer-based model. Four soil layers of increasing thicknesses of 10, 

237 30, 60 and 100 cm are used to model soil temperature and moisture dynamics with layer-based 

238 formulations of the heat diffusion equation (for energy) and of the standard diffusion and gravity 

239 drainage equations (for unsaturated water flow). LST in Noah is diagnosed from the surface 

240 energy balance equation and, unlike in CLSM, is not a prognostic variable and has no associated 

241 heat capacity (Figure 2). In this study, we use Noah version 2.7.1 on a 0.5 degree grid with a 30 

242 minute time step. Noah integrations were carried out with the Land Infonnation System (Kumar 

243 et al. 2008) fitted with the GMAO ensemble data assimilation and bias estimation modules 

244 (Reichle et al. 2009). 

245 

246 4.b. Data assimilation method and parameters 

247 In a data assimilation system, the model-generated land surface estimates are corrected toward 

248 observational estimates, with the degree of correction detennined by the levels of error 

249 associated with each. The assimilation system used here is based on the Ensemble Kalman filter 

250 (EnKF), which is well suited to the non-linear and intennittent character of land surface 

25 1 processes (Reichle et al. 2002a, 2002b). The EnKF works sequentially by perfonning in turn a 

252 model forecast step and a filter update step. Formally, the forecast step for ensemble member i 

253 can be written as 

254 
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255 (1) x t ,r = f(x t -i,i + , qu) 

256 

257 where x t) f and x t _i,i + are the forecast (denoted with ” ) and analysis (denoted with + ) state vectors 

258 at times t and t-1, respectively. The model error (or perturbation vector) is denoted with q u and 

259 its covariance with Q t . The filter update produces the analyzed state vector x L f at time t and can 

260 be written as 

261 

262 (2) x tjl + = x t ,f + K x>t ( y t ,i - H t x t> f ) 

263 

264 where y tj ; denotes the observation vector (suitably perturbed) and H t is the observation operator 

265 (which is written as if it were linear for ease of notation, but in practice the update is solved 

266 without explicitly computing H t ; Keppenne et al. 2000). The Kalman gain matrix K xt is given by 

267 

268 (3) K x , t = P x , t H t T ( H, P x , t H, T + R t f' 

269 

270 where P x , t is the state forecast error covariance (diagnosed from the ensemble x tJ ), R t is the 

27 1 observation error covariance, and superscript T denotes the matrix transpose. Simply put, the 

272 Kalman gain K x t represents the relative weights given to the model forecast and the 

273 observations, based on their respective uncertainties and based on the error correlations between 

274 the elements of the state vector and the model prediction of the observed variable. In this paper, 

275 we use 12 ensemble members with a "one-dimensional" (Id) EnKF that processes each location 

276 independently of all other locations (see, for example, Reichle and Koster (2003) for Id versus 
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277 3d assimilation). The key feature of the EnKF is that error estimates of the model-generated 

278 results are dynamically derived from an ensemble of non-linear model integrations. 

279 

280 Perturbation fields were generated and applied to surface air temperature and radiation which 

28 1 represent the dominant forcing inputs for LST. Perturbations were also applied to model 

282 prognostic variables (the Catchment surface temperature TSURF and the Noah top layer soil 

283 temperature TSOIL1), reflecting errors in the modeling of the surface energy balance. 

284 Collectively, the perturbations allow us to maintain an ensemble of land surface conditions that 

285 represents the uncertainty in modeled LST. An overview of the perturbation parameters is given 

286 in Table 2. Depending on the variable, normally distributed additive perturbations or log- 

287 normally distributed multiplicative perturbations were applied. The ensemble mean for all 

288 perturbations was constrained to zero for additive perturbations and to one for multiplicative 

289 perturbations. Moreover, time series correlations were imposed via a first-order auto-regressive 

290 model (AR(1)) for all fields. Since we used a one-dimensional EnKF in this study, the 

291 perturbation fields were not spatially correlated. At hourly and daily time scales, the 

292 meteorological forcing fields are ultimately based on output from atmospheric modeling and 

293 analysis systems and not on direct observations of surface forcings. We imposed error cross- 

294 correlations that are motivated by the assumption that the atmospheric forcing fields represent a 

295 realistic balance between radiation, clouds, and air temperature. Under that assumption, for 

296 example, a positive perturbation to the downward shortwave radiation tends to be associated with 

297 negative perturbations to the longwave radiation and with a positive perturbation to air 

298 temperature. 

299 
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300 Model errors are difficult to quantify at the global scale. The parameter values listed in Table 2 

301 are largely based on experience and are partially supported by earlier studies (Reichle et al. 

302 2002b; Reichle and Koster 2003; Reichle et al. 2007). The success of the assimilation presented 

303 here (section 6) suggests that these values are acceptable. In any case, further calibration of the 

304 filter parameters would, in theory, only improve the assimilation results. Clearly, more research 

305 is needed on the exact nature of the model and forcing errors. Recently developed adaptive 

306 filtering methods for land assimilation may assist with error parameter estimation (Reichle et al. 

307 2008, Crow and Reichle 2008). 

308 

309 The mapping of the satellite infonnation from the observation space into the space of the model 

310 states is accomplished through the Kalman gain during the EnKF update step. Equation (2) 

311 linearly relates "innovations" (observations minus corresponding model estimates before EnKF 

312 update, that is, y t ,i - H t x tJ ) to "increments" (model states after EnKF update minus same before 

313 EnKF update, that is, x tj i + - x tj i~ )• In this study, we use CLSM’s area-average TSURF variable 

3 14 and Noah’s diagnostic TSKIN variable to compute the (observation-space) innovations (Figure 2 

315 and Section 4a). The EnKF state vector for CLSM consists of the three TSURF prognostic 

316 variables (specific to each soil moisture regime, section 4a), while the EnKF state vector for 

317 Noah consists of the top layer soil temperature TSOIL1 . The key ingredients to the Kalman gain 

318 are the error correlations between the LST variables in observation space and the EnKF state 

319 variables (Reichle et al. 2002b). For Noah, the relevant error correlation is between the 

320 diagnostic TSKIN variable (that has no associated heat capacity) and the temperature in the top 

321 10 cm soil layer. Therefore, in the case of Noah the error correlation is affected by a small phase 

322 shift between the diurnal cycle of the diagnostic TSKIN and the top layer soil temperature 
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323 TS0IL1 . Observations are not assimilated when precipitation is falling or when the ground is 

324 covered with snow (as indicated by the land model integration). 

325 

326 4.c. Bias estimation 

327 For the reasons outlined in section 2 there are considerable differences between the temporal 

328 moments of the satellite and model LST (see section 6 for examples). Such biases need to be 

329 addressed in the data assimilation system. For this study, we implemented two different 

330 strategies. The first strategy is to scale the satellite observations to the model’s climatology so 

33 1 that the first and second moments of the satellite LST and the model LST match. This strategy is 

332 a simplified version of the cumulative distribution function matching (Reichle and Koster 2004) 

333 that has been used successfully for soil moisture assimilation (Reichle et al. 2009). Because of 

334 the strong diurnal and seasonal cycles of LST, we estimated the multi-year LST mean and 

335 variance separately for each calendar month and for eight different times-of-day (Oz, 3z, . . ., 

336 21z). This strategy is very easy to implement through preprocessing of the LST retrievals and 

337 makes no assumptions regarding whether the model’s or the observations’ climatology is more 

338 correct. Although the assimilation estimates are by design produced in the model’s climatology, 

339 they could be scaled back to the observational climatology if desired. The scaling strategy can 

340 be applied to the assimilation of retrievals from a variety of satellite datasets with different 

341 climatologies. An obvious disadvantage is the fact that the a priori scaling is static and cannot 

342 automatically adjust to dynamic changes in bias. 

343 

344 The second strategy is to dynamically estimate bias parameters along the lines of the algorithm 

345 developed by Dee (2005), which was used for LST by Bosilovich et al. (2007) and for soil 
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346 moisture by De Lannoy et al. (2007). This dynamic bias estimation approach is based on a 

347 second Kalman filter for bias estimation (in addition to the Kalman filter for state estimation). 

348 Assume that we have a bias estimate b t _i + at time t- 1 . Furthermore, assume that this bias 

349 estimate can be propagated to time t with a simple bias evolution model 

350 

351 (4) b t ~ =ab t _i + 

352 

353 that relaxes the bias estimates to zero (0 < a < 1). In our experiments, we chose a to correspond 

354 to an e-folding scale of 1 day. The use of a relaxation factor is different from the implementation 

355 of Bosilovich et al. (2007) and is a prudent strategy for experiments that cover many seasons. 

356 Because observations may not be available for extended periods, relaxing the bias estimate to 

357 zero is safer than keeping the latest bias estimate through seasons for which it may not be 

358 appropriate. 

359 

360 Next, we compute a bias-corrected model forecast 

361 

362 (5) iUi" = Xt,f-bt _ 

363 

364 that is used in the state update equation (2) (instead of the biased model forecast x tj ; ). From 

365 ensemble average innovations (computed as y t - H t 2F = E{y tj i - H t where E{-} is the 

366 ensemble mean operator), we can then update the bias via 

367 

368 (6) b t + = b t " - y K x , t (y t - H t E, t ~) 
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369 


370 A key assumption of this algorithm is that the bias error covariance Pb,t is a small fraction of the 

37 1 state error covariance, that is Pb.t = y P x ,t, which implies that the gain for the bias (Kb,t) can be 

372 computed as a fraction of the gain for the state, which has already been computed. Here, we use 

373 y = 0.2 for Catchment and y = 0.05 for Noah. The difference in y is motivated by the different 

374 surface layer thicknesses in the two models, but like the perturbations parameters that govern the 

375 model error, these bias parameters have not been optimized and are justified primarily by the 

376 success of the assimilation. 

377 

378 As fonnulated above, the bias algorithm estimates a single bias parameter (per state and per 

379 location). Here, we implemented a variant that estimates a separate bias parameter for eight 

380 different times-of-day (Oz, 3z, . . ., 21z). Because this requires the estimation of eight bias 

381 parameters per state and per location, we refer to this algorithm as "b8". Additional variants of 

382 the dynamic bias algorithm are discussed in the appendix. 

383 

384 Equation (6) implies that in practice, the bias estimates can be thought of as an exponential 

385 moving (time) average of the LST increments. Unlike the scaling approach, the dynamic bias 

386 estimation strategy adapts to slow changes in bias over time. A major disadvantage of this 

387 strategy is the implicit assumption that only the model is biased, which contradicts previous 

388 findings that retrievals from different sensors may be biased against each other (for example, 

389 Trigo and Viterbo 2003). It is therefore critical that any bias between retrievals from different 

390 sensors is small compared to the bias between retrievals and the model estimates. 

391 
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392 For this study we implemented the a priori scaling method and the dynamic bias estimation 

393 schemes independently. The latter can be invoked with and without a priori scaling. If invoked 

394 without a priori scaling, the dynamic bias estimation corrects for static (long-term) biases as well 

395 as shorter-term "bias" that results from transient differences between model and observational 

396 estimates. If invoked after a priori scaling, the dynamic bias estimation mostly corrects for 

397 transient bias. It can also be considered a tool for remembering assimilation increments that 

398 would otherwise be forgotten within a single model time step (because of the small heat capacity 

399 associated with LST). This is particularly important for CLSM as section 6 will show. 

400 
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40 1 5. Experiment design and skill metrics 

402 The experiment domain consists of the catchments (or grid cells) that contain the CEOP station 

403 locations (Figure 1). The land models were spun up by cycling 10 times through the 4-year 

404 period from 1 January 2001 to 1 January 2005. The models were then integrated in ensemble 

405 mode (12 members; using the perturbations settings of Table 2) for the same 4-year period. 

406 These open loop integrations also recorded the LST innovations (without applying any 

407 increments) for the computation of the model and retrieval statistics that are required for the a 

408 priori scaling approach. The (ensemble) assimilation integrations covered the same 4-year 

409 period and were validated against the 27 months of CEOP observations from 1 October 2002 

410 through 1 January 2005. 

411 

412 For each land model, we conducted one open loop (no assimilation) ensemble integration and 

413 four different experiments in which ISCCP retrievals were assimilated assuming an observation 

414 error standard deviation of 2 K. Two of the four assimilation integrations (per model) were 

415 perfonned with the (unsealed) ISCCP retrievals ("sO" for "no scaling"), the other two utilized 

416 ISCCP retrievals that were scaled to each model’s LST climatology prior to assimilation ("si"; 

417 section 4c). In each set of two assimilation integrations, one was done without bias correction 

418 ("bO"), and the other used the dynamic bias algorithms ("b8"; section 4c). For each model, we 

419 thus compare four assimilation integrations: "sObO", "s0b8", "slbO", and "slb8". 

420 

42 1 All integrations were analyzed by computing RMSE values (from raw time series) and anomaly 

422 correlation coefficients (R; from anomaly time series) for LST, LH, SH, and GH with respect to 

423 the available in situ CEOP observations. These performance metrics were first computed 
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424 separately for each station from 3 -hourly time series and then averaged over the available 

425 stations. Anomaly time series were computed by subtracting the monthly mean seasonal and 

426 diurnal cycle (climatology) from the raw data. The climatologies and metrics were computed 

427 only from data at times and locations where ISCCP retrievals were available so that only clear- 

428 sky conditions are compared to the extent possible. Mean values for a given calendar month and 

429 time-of-day are computed only if a minimum of 20 data points are available; mean monthly 

430 values are computed only if mean values are available for all eight times-of-day for the month in 

43 1 question; and performance metrics are based on at least 100 data points. 

432 

433 We analyze two performance metrics, RMSE and anomaly R, to highlight the advantages and 

434 disadvantages of the various assimilation approaches. Given the typically strong seasonal and 

435 diurnal cycles of LST and land surface fluxes, metrics based on raw data are dominated by errors 

436 in the climatology. Metrics based on anomalies, by contrast, primarily capture day-to-day 

437 variations. RMSE values measure how closely the data agree in their original units and are 

438 affected by a mean bias or a mean difference in the amplitude of variations. R values, on the 

439 other hand, are not affected by such biases and only capture the phasing between the estimates 

440 and the validating observations. The choice of metric depends on the application at hand 

441 (Entekhabi et al. 2010). RMSE values are most relevant if absolute errors matter most. In other 

442 cases, anomaly R values may be of most relevance, for example in model-based applications 

443 (such as Numerical Weather Prediction) that could correct for known biases in the mean and 

444 variance. 

445 
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446 6. Results 

447 6.a. Aggregate performance 

448 The station-average RMSE and R metrics, evaluated against the CEOP in situ observations as 

449 discussed in the previous section, measure the aggregate perfonnance of the satellite, model, and 

450 assimilation estimates. In this section, we highlight the performance of select assimilation 

45 1 integrations in terms of select metrics (see appendix for a complete table of metrics and 

452 algorithms). Before turning to the assimilation integrations, however, we first assess the skill of 

453 the satellite retrievals and of the model integrations without assimilation (open loop). The top 

454 panel of Figure 3 illustrates the RMSE values computed from the raw LST estimates, which are 

455 4.9 K for CLSM (yellow bar), 5.6 K for Noah (light blue bar), and 7.6 K for ISCCP (black bar). 

456 The bottom panel shows corresponding RMSE values for model estimates of LH, SH, and GH, 

457 which range from 50 to 67 W m' and are within a factor of two of typical measurement errors 

458 (around 30 W m' ) for surface turbulent fluxes (Finkelstein and Sims 2001, Hollinger and 

459 Richardson 2005). The first important results of Figure 3 are therefore that (i) the CLSM and 

460 Noah open loop integrations show similar skill when compared to CEOP in situ observations and 

461 that (ii) the model estimates of LST are significantly better than ISCCP retrievals. 

462 

463 The situation is similar for the anomaly R metric, shown in Figure 4 for LST only. The models 

464 show reasonable skill in terms of reproducing standardized anomalies, with anomaly R values of 

465 about 0.6. Again, ISCCP retrievals are significantly less skillful than model estimates (anomaly 

466 R value of 0.52). An analysis of sampling error reveals that 95 % confidence intervals for all R 

467 values discussed here are less than ±0.01. For RMSE values, 95 % confidence intervals are less 

468 than ±0.1 K for LST and less than ±1 W nf 2 for surface fluxes. In the following, RMSE and 
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469 anomaly R values are rounded accordingly (skill difference values rounded after computing the 

470 difference). 

471 

472 Obviously, the superior skill of the model LST estimates relative to the skill of the ISCCP 

473 retrievals limits the improvements that can be expected from assimilating the ISCCP data. 

474 Nevertheless, as will be shown, modest yet statistically significant improvements can be 

475 achieved through the assimilation of ISCCP LST retrievals. We expect that the use of a priori 

476 scaling produces better anomaly estimates, whereas omitting a priori scaling should yield better 

477 improvements in terms of absolute numbers (that is, raw data) due to likely biases in model 

478 climatology. Thus, for RMSE computed from raw data (Figure 3), we focus on assimilation 

479 without a priori scaling (sObO, s0b8). The CLSM assimilation integration without dynamic bias 

480 correction (sObO; orange bars) is characterized by only minor improvements in LST (0.2 K for 

48 1 RMSE) and virtually no changes in the land surface flux estimates (when compared to the open 

482 loop skill). Recall that CLSM’s prognostic "surface temperature" represents the canopy and top 

483 5 cm of soil and is associated with a very small heat capacity. Consequently, without dynamic 

484 bias estimation and correction, increments from the assimilation of ISCCP retrievals dissipate 

485 quickly and have little impact on the model state in CLSM. Increments applied to the Noah 

486 model’s top layer (10 cm) soil temperature, on the other hand, have a somewhat more noticeable 

487 effect on the model state. For assimilation without a priori scaling and without dynamic bias 

488 correction (sObO), the RMSE for Noah LST estimates is improved by 0.5 K (medium blue bar in 

489 Figure 3). As for CLSM, the Noah assimilation estimates for the latent and sensible heat fluxes 

490 are comparable to the open loop estimates. However, the RMSE value for Noah sObO estimates 

491 of GH increases by 1 1 W m" (more on this later). 

492 
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493 Next, we analyze the skill of the assimilation integrations with dynamic bias estimation and 

494 correction (s0b8), also illustrated in Figure 3. For CLSM, adding dynamic bias estimation (red 

495 bar) enhances the LST improvements (over the open loop) to 0.7 K in terms of RMSE. Using a 

496 bias-corrected model forecast at every model time step (equation (5)) enhances the impact of 

497 LST increments in the CLSM assimilation integrations and thereby yields enhanced 

498 improvements from the assimilation of the ISCCP retrievals (relative to not using the dynamic 

499 bias algorithm). For Noah, using the dynamic bias algorithm (s0b8; dark blue bar in Figure 3) 

500 yields only slightly better LST than the sObO assimilation integration without dynamic bias 

501 correction (RMSE now reduced by 0.6 K over the open loop). At the same time, however, the 

502 deterioration of the GH estimates is exacerbated in s0b8. The RMSE value for assimilation 

503 estimates of GH increases by 28 W m' when compared with the open loop RMSE. LH and SH 

504 estimates from s0b8 are again comparable to open loop estimates. 

505 

506 For the analysis of assimilation integrations with a priori scaling (slbO, slb8) we focus on the 

507 anomaly R metric, shown in Figure 4. Qualitatively, the results for these integrations are similar 

508 to those obtained without a priori scaling. In CLSM, a priori scaling alone (slbO) yields only 

509 small improvements in LST (anomaly R increases by 0.02). Most of the impact is realized 

510 through dynamic bias estimation (anomaly R for LST increases by 0.05 over the open loop). For 

511 Noah, on the other hand, the anomaly R for LST already increases by 0.04 (over the open loop) 

512 when only a priori scaling is applied. The estimates get only slightly better when dynamic bias 

513 correction is added (increase of 0.05 in anomaly R for LST). There is also a deterioration in the 

514 Noah GH estimates (relative to the open loop) when a priori scaling is used (with or without bias 

515 correction), but generally the loss of skill in GH is mitigated through a priori scaling (not shown, 

516 see appendix). Noah assimilation estimates of LH and SH have marginally better anomaly R 
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517 values than open loop estimates (not shown), but these improvements are so small that we do not 

518 consider them to be relevant. 

519 

520 In summary, using dynamic bias estimation for CLSM (s0b8, slb8) provides the best 

521 assimilation estimates and enables modest LST improvements (over the open loop) of up to 0.7 

522 K in tenns of RMSE and up to 0.05 in terms of anomaly R. Flux estimates from these CLSM 

523 assimilation integrations are essentially identical to open loop estimates - assimilation of LST 

524 does not lead to improved flux estimation. For Noah, assimilation without a priori scaling and 

525 without dynamic bias estimation (sObO) already yields most of the benefit of assimilating the 

526 satellite retrievals. Using a priori scaling and/or dynamic bias estimation yields only small 

527 additional improvements. For Noah, LST improvements (over the open loop) are similar to 

528 those for CLSM: up to 0.6 K for RMSE and up to 0.05 in terms of anomaly R. Noah 

529 assimilation estimates of SH and LH are similar to the open loop estimates, but the assimilation 

530 estimates of GH are considerably worse than the open loop estimates (up to 28 W m' 2 increase in 

53 1 RMSE), with lesser degradation seen when a priori scaling is used. Without a priori scaling, the 

532 worsening of the GH estimates in Noah may well outweigh the benefits of the LST 

533 improvements. 

534 

535 As shown in Figure 1 and Table 1, most CEOP stations either have LST and GH measurements 

536 or have LH and SH measurements. Only four stations measure LST and all three fluxes. The 

537 results in this section must be interpreted with this caveat in mind. Conceivably, if a large 

538 number of stations were able to provide LST, LH, and SH measurements together, and if our 

539 analyses were limited to that set of stations, we might indeed be able to show that improved LST 

540 estimates from assimilation correspond to improved estimates of the turbulent fluxes. Given the 
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541 limitations of the available in situ measurements, we cannot know for sure. Note, however, that 

542 15 out of the 19 stations that have LST measurements also have GH measurements, suggesting 

543 that for Noah the LST improvements from assimilation are connected to the worsening of GH 

544 estimates. 

545 

546 As discussed in section 5, the RMSE and anomaly R values shown here are based on 3-h average 

547 data, including nighttime and wintertime when fluxes are small and noise may overwhelm the 

548 signal. In a separate analysis (not shown) we also computed the perfonnance metrics from daily 

549 average data. While RMSE values are generally lower when based on daily averages (as can be 

550 expected from increased averaging) and R values are also somewhat different in overall 

551 magnitude, the relative perfonnance is similar regardless of whether metrics are computed from 

552 daily or 3-h average data, and the main conclusions of this section thus remain unchanged. 

553 In a second separate analysis (not shown), we assessed the performance based strictly on 

554 summertime data and again find that our conclusions remain the same. 

555 

556 6.b. Seasonal and diurnal cycles 

557 Estimates of the mean seasonal and diurnal cycles provide additional insights into the modeling 

558 and assimilation of LST. Figure 5 (left panels) shows the mean seasonal cycle of LST at two 

559 locations, Bondville in the US Midwest and BJ-SAWS3 in Tibet. At both locations, the seasonal 

560 cycle estimates of the CLSM and Noah open loop integrations agree fairly closely with each 

561 other (to within 2 K), primarily because both models are driven with the same surface 

562 meteorological forcing data. At Bondville, the open loop estimates of the seasonal cycle also 

563 agree closely with the in situ CEOP observations. At the Tibetan station, however, the open loop 

564 estimates are biased low by about 5 K (relative to the in situ observations). In contrast, ISCCP 
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565 estimates of the seasonal cycle are biased high by about 3 K at BJ-SAWS3 and differ by up to 5 

566 K in the first half of the year at Bondville. 

567 

568 By construction, the seasonal and diurnal cycle estimates of assimilation integrations with a 

569 priori scaling closely match those of the open loop integration (not shown). Figure 5 also shows 

570 the seasonal climatology of assimilation integrations without a priori scaling and with dynamic 

571 bias estimation and correction (s0b8). As expected, the s0b8 assimilation integrations draw more 

572 closely to the ISCCP retrievals (when compared to the open loop). At BJ-SAWS3, this 

573 fortuitously brings the assimilation estimates of the seasonal cycle into better agreement with the 

574 in situ observations than either the ISCCP or the open loop estimates. 

575 

576 Figure 5 (right panels) also illustrates the mean August diurnal cycle estimates at Bondville and 

577 BJ-SAWS3. At Bondville, the Noah open loop estimates have a slightly higher diurnal 

578 amplitude than the CLSM estimates (because LST for Noah exceeds that of CLSM by up to 2 K 

579 during the day and by less than IK during the night). At BJ-SAWS3, the open loop integrations 

580 agree closely with each other. Similar to the seasonal cycle estimates, the open loop estimates at 

581 Bondville are in reasonable agreement with the in situ observations but are biased low (by about 

582 5 K) at BJ-SAWS3. ISCCP retrievals, on the other hand, exhibit a weaker diurnal amplitude at 

583 Bondville than model or in situ observations. At BJ-SAWS3, ISCCP retrievals are biased high 

584 (compared to the in situ observations) in the morning and mid-day but biased low in the evening 

585 and at nighttime. 

586 

587 Again, the LST diurnal cycle estimates of the CLSM s0b8 assimilation integrations draw towards 

588 the ISCCP retrievals by construction (Figure 5, right panels). This brings them closer to the in 
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589 situ observations at BJ-SAWS3 but makes CLSM nighttime estimates at Bondville worse when 

590 compared to the open loop. By contrast, LST from the Noah s0b8 assimilation integration is 

591 similar to the open loop integration during daytime. Only during the evening hours (at Bondville 

592 and BJ-SAWS3) are the Noah s0b8 LST estimates noticeably closer to the ISCCP retrievals. 

593 The delayed impact of the LST assimilation is probably a result of the phase lag between the 

594 diagnostic LST (observations space) and the top layer soil temperature (state space) in Noah. 

595 

596 6.c. Filter diagnostics 

597 Internal filter diagnostics offer further clues about the perfonnance of the assimilation 

598 algorithms. For a filter that operates according to its underlying assumptions (that various 

599 linearizations hold, that model and observation errors are unbiased, uncorrelated and normally 

600 distributed), the time average of the (ensemble mean) innovations sequence (y t - H t x t ~) equals 

601 zero. Moreover, the standard deviation of the "normalized" innovations (y t - H t x t ~)-( H t P x t H t x 

602 + R t )"° 5 equals one (Reichle et al. 2002a). The latter diagnostic compares the actual spread in the 

603 innovations to what the filter expects. A simple interpretation is that the assumed error bars of a 

604 model forecast and its corresponding observation must have an appropriate overlap. 

605 

606 Figure 6 displays the distribution of these two internal filter diagnostics across the CEOP stations 

607 listed in Table 1. The top panel indicates that without a priori scaling of the ISCCP observations 

608 and without dynamic bias estimation (sObO), biases of several Kelvin typically persist in the 

609 model forecast and are reflected in the mean of the innovations. (The innovations statistics of 

610 the open loop integrations are essentially the same as those of the sObO integrations and are not 

611 shown in the figure.) A very modest reduction of the bias can be achieved with the "b8" 

612 dynamic bias estimation algorithm. If in addition to dynamic bias estimation the observations 
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613 are also scaled prior to data assimilation (slb8), the innovations mean at all stations is, by 

614 construction, much closer to zero. These results hold for assimilation into both land models, 

615 CLSM and Noah. 

616 

617 The standard deviation of the normalized innovations, shown in the bottom panel of Figure 6, 

618 exceeds the target value of one at almost all stations and for almost all algorithms. This indicates 

619 that the model and/or the observation error standard deviations were underestimated. Like for 

620 the innovations mean, a priori scaling brings the standard deviation of the normalized 

62 1 innovations much closer to its expected value of one. This implies that a large part of the 

622 mismatch between the actual spread in the innovations and the expected spread is simply due to 

623 bias. Finally, the fact that the innovations diagnostics are comparable for the CLSM and Noah 

624 assimilation integrations indicates that the assimilation performance (relative to its unknown 

625 optimum) is comparable for the two models, which lends further support to the broad 

626 conclusions reached in this paper. In summary, a priori scaling in combination with dynamic 

627 bias estimation exhibits the best performance in terms of the innovations diagnostics, 

628 independent of the land model used. 

629 

630 6.d. Impact of bias in data assimilation 

63 1 Estimates from a properly designed assimilation system should be no worse than open loop 

632 estimates. The example in Figure 7 further illustrates the potentially serious detrimental impact 

633 of not addressing bias properly in data assimilation. The figure shows LST and land surface flux 

634 time series from select Noah integrations for a few days in August of 2003 at the MGS station in 

635 Mongolia. At this location, LST and GH estimates from the open loop integrations agree fairly 

636 well with CEOP in situ observations. The daytime peak LST estimates from ISCCP, however, 
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637 are warmer by as much as 30 K. This extreme bias may be due to one or more of the reasons 

638 discussed in section 2. 

639 

640 At any rate, when the unsealed ISCCP retrievals are assimilated without a priori scaling (sObO, 

641 s0b8), LST assimilation estimates are drawn toward the extreme temperatures in the ISCCP 

642 retrievals (Figure 7). However, since Noah is not designed to accommodate such extreme 

643 temperatures, and because the surface meteorological forcing remains unchanged in the system, 

644 the Noah assimilation integrations without a priori scaling respond with unrealistic and excessive 

645 estimates of sensible and ground heat flux, most notably on 13 August 2003. Because the impact 

646 on LST of assimilating unsealed ISCCP retrievals without bias correction (sObO) is more limited, 

647 the corresponding flux estimates are less pathological than in the s0b8 case with dynamic bias 

648 correction. For reference, Figure 7 also shows an assimilation integration with a priori scaling 

649 and dynamic bias correction (slb8), which does not produce such unrealistic flux estimates. 

650 

65 1 The situation is similar for LST from CLSM assimilation integrations for the same location and 

652 time period (not shown), but for CLSM we obtain unrealistic estimates of the latent heat flux for 

653 select assimilation integrations without a priori scaling. To summarize, Figure 7 illustrates the 

654 pitfalls of assimilating LST retrievals that are severely biased against model LST. While the 

655 assimilation can be designed to produce LST estimates that are closer to the satellite retrievals, 

656 there may be unintended and undesirable side effects in terms of the land surface fluxes and the 

657 surface energy balance. 

658 
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659 7. Summary and conclusions 

660 An ensemble-based data assimilation method, the EnKF, was used with and without a priori 

661 scaling of observations and/or dynamic bias estimation methods to assimilate satellite retrievals 

662 of LST into two different land surface models at 48 CEOP sites scattered around the globe. The 

663 two land models, CLSM and Noah, follow distinct modeling approaches for land surface 

664 temperature. CLSM has model prognostic "surface temperature" variables, whereas Noah 

665 diagnoses the surface temperature from the surface energy balance. The LST, sensible, latent, 

666 and ground heat flux estimates from the data assimilation integrations were validated against 27 

667 months of CEOP in situ observations. 

668 

669 The main conclusions from the experiments are as follows. 

670 (1) There are strong biases between LST estimates from in situ observations, land modeling, and 

67 1 satellite retrievals that vary with season and time-of-day. Biases of a few Kelvin are typical, 

672 with larger values exceeding 10 K. 

673 (2) The skill of LST estimates from the CLSM and Noah land model integrations is superior to 

674 that of the ISCCP satellite retrievals. 

675 (3) Assimilation of ISCCP LST retrievals into the land surface models can improve LST 

676 estimates by up to 0.7 K for RMSE and by up to 0.05 for anomaly R, while not making surface 

677 turbulent fluxes worse. 

678 (4) Gross errors in surface flux estimates can result if biases are not taken into account properly, 

679 with a combination of a priori scaling and dynamic bias estimation methods yielding the best 

680 overall results. 
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68 1 (5) Assimilation diagnostics for integrations without a priori scaling strongly reflect the 

682 underlying biases, indicating that without a priori scaling the assimilation system is far from 

683 operating in accordance with its underlying assumptions. 

684 (6) Provided the assimilation system is properly configured for each land model, the benefits 

685 from the assimilation of LST retrievals are comparable for both land models. 

686 

687 There are many reasons why the improvements from the assimilation of satellite LST, while 

688 statistically significant, turn out to be modest. First and foremost, the skill of the satellite data is 

689 modest and much lower than that of the model to begin with. The infonnation gained by 

690 assimilating the satellite data into the model is therefore naturally limited. In the present study, 

691 the parameters of the assimilation system, including the perturbations (or model error) 

692 parameters and the parameters of the bias algorithm (a, y) were not optimized. Additional 

693 calibration may further improve the results and may also reveal differences in what can be 

694 achieved with a given land model structure. Finally, even if the assimilation estimates were 

695 perfect, the perfonnance metrics would not show it because of errors in the in situ data and 

696 because of the mismatch of the spatial and temporal characteristics of the satellite, model, and in 

697 situ data sets. 

698 

699 By design, the present study was limited to LST (state) estimation in an uncoupled land 

700 modeling system. Ideally, the land model parameters would be calibrated to minimize LST 

701 biases prior to data assimilation, or perhaps even dynamically within the data assimilation 

702 system. Such a calibration could rely on sophisticated parameter estimation methods (Vrugt et 

703 al. 2003). Perhaps more importantly, though, is that the surface meteorological forcings in the 
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704 present experiments were fixed. In other words, LST increments did not feed back onto the 

705 atmospheric state. Future experiments will explore LST assimilation into a coupled land- 

706 atmosphere model with the methods proposed here. The relative performance of the algorithms 

707 may very well change in the coupled environment. 

708 

709 Another path for future research is to investigate further the role of specific aspects of LST 

710 modeling. In the present paper, Noah was integrated in its default configuration, including a 10 

711 cm thick surface layer, which implied a small phase shift between the Noah diagnostic LST 

712 (used in the computations of the innovations) and the Noah top soil temperature (to which the 

713 increments were applied). Use of a thinner soil layer may alleviate the problems related to the 

714 phase shift in Noah between LST (observation space) and the top layer soil temperature (state 

715 space). 

716 

717 Even if the assimilation manages to improve LST only modestly and fluxes not at all, the impact 

718 may be significant because minor improvements in LST may already increase the number of 

719 atmospheric retrievals that can be assimilated in coupled systems, thereby possibly providing 

720 substantial indirect benefits. Obviously, the present study only scratches the surface of a very 

72 1 complex problem that has been a challenge for many years. Nevertheless, given the relative 

722 abundance of LST observations from satellites and the importance of accurate LST estimates, in 

723 particular in the context of atmospheric data assimilation, the results of the present study offer an 

724 encouraging step forward in land data assimilation. 

725 

726 
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727 Appendix 

728 In addition to the dynamic bias algorithm "b8" discussed in section 4c we also tested additional 

729 variants of the algorithm used by Bosilovich et al. (2007). Specifically, we tested the simplest 

730 possible variant that estimates a single, "mean" bias estimation parameter per state and per 

731 location, here referred to as "bl". Bosilovich et al. (2007) also introduced a bias 

732 parameterization with sine and cosine functions that accounts for variations in the diurnal cycle 

733 of the bias. Here we implemented two variants: a "diurnal" bias parameterization (constant term 

734 plus sine and cosine waves with a period of one day) and a "semi-diurnal" bias parameterization 

735 ("diurnal" terms plus sine and cosine waves with a period of one half day). The "diurnal" and 

736 "semi-diurnal" algorithms estimate three and five bias parameters, respectively, per state and 

737 location, and are referred to as "b3" and "b5". 

738 

739 For the "b8" integrations discussed in the main text we always applied both the state increments 

740 (equation (2)) and the bias increments (equation (6)). More generally, though, any assimilation 

741 integration that uses dynamic bias estimation can also be done without applying the state 

742 increments (as, in fact, implemented by Bosilovich et al. (2007)). We also tested these variants. 

743 Per model, we therefore tested a total of 18 different assimilation integrations, listed in Table 3. 

744 

745 A close examination of Table 3 reveals that generally, the assimilation integrations show more 

746 skill when more bias parameters are used. Reductions in RMSE values for LST are greater by up 

747 to 0.4 K for CLSM and by up to 0.2 K for Noah when comparing "b8" and "bl "integrations. 

748 Corresponding differences in anomaly R values for LST are up to 0.03, respectively, for both 

749 models. As can be expected, these differences across bias estimation algorithms of varying 
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750 complexity tend to be greater without a priori scaling. Also in line with expectations, taking 

751 into account the diurnal cycle of the bias (as in "b3", "b5", and "b8", as opposed to "bl") has the 

752 biggest impact. 

753 

754 Finally, applying the state increments in addition to the bias increments contributes only a small 

755 amount of skill, typically less than 0.1 K in tenns of RMSE reduction (or 0.01 in terms of R 

756 increase) for LST estimates. In the case of Noah, applying the state increments contributes 

757 commensurately to the worsening of the GH estimates. 

758 

759 To summarize, as long as the dynamic bias algorithm takes the diurnal cycle into account, the 

760 differences that result from the exact number of bias parameters used or that result from not 

761 applying the state increments are much smaller than the assimilation improvements over the 

762 open loop. In other words, the lessons learned in the main text about assimilation of LST 

763 retrievals in general and about using a priori scaling and/or dynamic bias correction are 

764 insensitive to the details of the dynamic bias estimation algorithm, provided the algorithm 

765 considers the diurnal cycle of the bias. 

766 
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927 Table captions 

928 

929 Table 1. CEOP stations with sufficient in situ LST, LH, SH, or GH observations for validation 

930 (CSE=Continental-Scale Experiment). 

931 

932 Table 2. Parameters for perturbations to meteorological forcing inputs and model prognostic 

933 variables. 

934 

935 Table 3. Skill of model and assimilation integrations versus CEOP in situ observations. 

936 Integrations shown in bold face are those discussed in the main text. Anomaly R for LST from 

937 ISCCP is 0.52, RMSE is 7.6 K. 

938 

939 


44 



940 Figure captions 

941 

942 Fig. 1. Location of CEOP stations. Stations suitable for validation are indicated with circles 

943 (SH, LH, 30 stations), plus signs (GH, 20 stations), and crosses (LST, 19 stations). 

944 

945 Fig. 2. LST is described in (left) CLSM with a prognostic variable ("TSURF") and in (right) 

946 Noah with a diagnostic variable ("TSKIN"). LST increments are applied to "TSURF" in CLSM 

947 and to "TSOIL1" in Noah (section 4.b.). 

948 

949 Fig. 3. RMSE versus CEOP in situ observations for (top) LST and (bottom) flux estimates from 

950 ISCCP retrievals (LST only), model integrations, and select assimilation integrations without a 

95 1 priori scaling. 

952 

953 Fig. 4 . R versus CEOP in situ observations for LST anomalies from ISCCP retrievals, model 

954 integrations, and select assimilation integrations with a priori scaling. 

955 

956 Fig. 5 . LST (left) annual seasonal and (right) August diurnal cycle at (top) Bondville and 

957 (bottom) BJ-SAWS3 for CEOP, ISCCP, model, and assimilation data. 

958 

959 Fig. 6. (Top) Mean of innovations [K] and (bottom) standard deviation of normalized 

960 innovations [dimensionless] for (C) Catchment and (N) Noah assimilation integrations. The box 

96 1 plots indicate the average, standard deviation, minimum and maximum of the respective 

962 innovations diagnostic across the stations listed in Table 1. 
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963 


964 Fig. 7. (Top) LST, (upper middle) LH, (lower middle) SH, and (bottom) GH for Noah 

965 integrations, ISCCP retrievals, and CEOP observations at the MGS station. 

966 
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CEOP Identifier 

Coordinates 

Data Availability 

CSE 

Reference Site 

Station 

Lat 

Lon 

LST 

LH+SH 

GH 

BALTEX 

Cabauw 

Cabauw 

51.97 

4.93 

yes 

yes 

yes 

BALTEX 

Lindenberg 

Falkenberg 

52.17 

14.12 

yes 

yes 

yes 

BALTEX 

Lindenberg 

Forest 

52.18 

13.95 

yes 

yes 

yes 

CAMP 

CbaoPlirayaRiver 

Lampang 

18.40 

99.47 

no 

no 

yes 

CAMP 

Himalayas 

Pyramid 

27.96 

86.81 

no 

no 

yes 

CAMP 

Mongolia 

BTS 

46.78 

107.14 

yes 

no 

yes 

CAMP 

Mongolia 

DGS 

46.13 

106.37 

yes 

no 

yes 

CAMP 

Mongolia 

DRS 

46.21 

106.71 

yes 

no 

yes 

CAMP 

Mongolia 

MGS 

45.74 

106.26 

yes 

no 

yes 

CAMP 

NorthEastThai 

N akhonracbasima 

14.47 

102.38 

yes 

no 

yes 

CAMP 

Tibet 

Amdo-Tower 

32.24 

91.62 

no 

no 

yes 

CAMP 

Tibet 

ANNI-AWS 

31.25 

92.17 

yes 

no 

yes 

CAMP 

Tibet 

BJ-SAWS1 

31.37 

91.90 

yes 

no 

no 

CAMP 

Tibet 

BJ-SAWS2 

31.37 

91.90 

yes 

no 

no 

CAMP 

Tibet 

BJ-SAWS3 

31.37 

91.90 

yes 

no 

no 

CAMP 

Tibet 

BJ-Tower 

31.37 

91.90 

yes 

no 

yes 

CAMP 

Tibet 

D105-AWS 

33.06 

91.94 

yes 

no 

yes 

CAMP 

Tibet 

D66-AWS 

35.52 

93.78 

yes 

no 

no 

CAMP 

Tibet 

Gaize 

32.30 

84.05 

yes 

no 

yes 

CAMP 

Tibet 

MS3478-AWS 

31.93 

91.71 

yes 

no 

yes 

CAMP 

Tibet 

MS3608-AWS 

31.23 

91.78 

yes 

no 

no 

CAMP 

Tongyu 

Cropland 

44.42 

122.87 

no 

yes 

yes 

CAMP 

Tongyu 

Grassland 

44.42 

122.87 

no 

yes 

yes 

GAPP 

Bondville 

Bondville 

40.01 

-88.29 

yes 

yes 

yes 

GAPP 

SGP 

E 1 Lamed 

38.20 

-99.32 

no 

yes 

no 

GAPP 

SGP 

E 2 Hillsboro 

38.31 

-97.30 

no 

yes 

no 

GAPP 

SGP 

E 3 Le Roy 

38.20 

-95.60 

no 

yes 

no 

GAPP 

SGP 

E 4 Plevna 

37.95 

-98.33 

no 

yes 

no 

GAPP 

SGP 

E 5 Halstead 

38.11 

-97.51 

no 

yes 

no 

GAPP 

SGP 

E 6 Towanda 

37.84 

-97.02 

no 

yes 

no 

GAPP 

SGP 

E 7 Elk Falls 

37.38 

-96.18 

no 

yes 

no 

GAPP 

SGP 

E 8 Coldwater 

37.33 

-99.31 

no 

yes 

no 

GAPP 

SGP 

E 9 Ashton 

37.13 

-97.27 

no 

yes 

no 

GAPP 

SGP 

E 10 Tyro 

37.07 

-95.79 

no 

yes 

no 

GAPP 

SGP 

E 12 Pawhuska 

36.84 

-96.43 

no 

yes 

no 

GAPP 

SGP 

E 13 Lamont 

36.60 

-97.48 

no 

yes 

no 

GAPP 

SGP 

E 14 Lamont 

36.61 

-97.49 

no 

yes 

no 

GAPP 

SGP 

E 1 5 Ringwood 

36.43 

-98.28 

no 

yes 

no 

GAPP 

SGP 

E 16 Vici 

36.06 

-99.13 

no 

yes 

no 

GAPP 

SGP 

E 18 Morris 

35.69 

-95.86 

no 

yes 

no 

GAPP 

SGP 

E 19 El Reno 

35.55 

-98.02 

no 

yes 

no 

GAPP 

SGP 

E 20 Meeker 

35.56 

-96.99 

no 

yes 

no 

GAPP 

SGP 

E 21 Okmulgee 

35.62 

-96.06 

no 

yes 

no 

GAPP 

SGP 

E 22 Cordell 

35.35 

-98.98 

no 

yes 

no 

GAPP 

SGP 

E 24 Cyril 

34.88 

-98.20 

no 

yes 

no 

GAPP 

SGP 

E 26 Cement 

34.96 

-98.08 

no 

yes 

no 

GAPP 

SGP 

E 27 Earlsboro 

35.27 

-96.74 

no 

yes 

no 

MDB 

Tumbarumba 

Tumbarumba 

-35.65 

148.15 

no 

yes 

yes 


968 Table 1 . CEOP stations with sufficient in situ LST, LH, SH, or GH observations for validation 

969 (CSE=Continental-Scale Experiment). 
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Perturbation 

Additive (A) 
or 

Standard 

deviation 

AR(1) time 
series 

Cross-correlation 
with perturbations in 


multiplicative 

(M)? 


correlation 

scale 

T2M 

SW 

LW 

Air temperature 
(T2M) 

A 

1 K 

1 day 

n/a 

0.4 

0.4 

Downward shortwave 
(SW) 

M 

0.3 

1 day 

0.4 

n/a 

-0.6 

Downward longwave 
(LW) 

A 

20 W in 2 

1 day 

0.4 

-0.6 

n/a 

Soil temperature 
prognostic variables 
(Catchment: TSURF; 
Noah: TSOIL1) 

A 

0.2 K 

12 h 

0 

0 

0 


970 

97 1 Table 2. Parameters for perturbations to meteorological forcing inputs and model prognostic 

972 variables. 
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Assimilation 

Apply State Increments 

A Priori Scaling 

Bias Algorithm 

CLSM 

Noah 

Anomaly R 

RMSE 

Anomaly R 

RMSE 

LST 

LH 

SH 

GH 

LST 

LH 

SH 

GH 

LST 

LH 

SH 

GH 

LST 

LH 

SH 

GH 

[■] 

[■] 

[K] 

[Wm' 2 ] 

[-] 

[-] 

[K] 

[Wm' 2 

] 

no 

n/a 

n/a 

n/a 

0.61 

0.31 

0.21 

0.25 

4.9 

54 

67 

54 

0.62 

0.26 

0.12 

0.24 

5.6 

50 

62 

50 

yes 

yes 

sO 

bO 

0.63 

0.31 

0.21 

0.25 

4.7 

54 

67 

54 

0.66 

0.27 

0.14 

0.14 

5.1 

50 

61 

61 

yes 

yes 

sO 

bl 

0.64 

0.31 

0.21 

0.26 

4.6 

54 

67 

53 

0.64 

0.28 

0.14 

0.08 

5.2 

50 

62 

81 

yes 

yes 

sO 

b3 

0.66 

0.31 

0.21 

0.26 

4.4 

54 

68 

54 

0.65 

0.28 

0.14 

0.08 

5.1 

50 

62 

84 

yes 

yes 

sO 

b5 

0.66 

0.31 

0.21 

0.25 

4.4 

54 

68 

54 

0.65 

0.27 

0.14 

0.08 

5.1 

50 

62 

84 

yes 

yes 

sO 

b8 

0.67 

0.31 

0.21 

0.25 

4.2 

54 

68 

54 

0.67 

0.28 

0.14 

0.11 

5.0 

50 

62 

78 

yes 

no 

sO 

bl 

0.63 

0.31 

0.21 

0.26 

4.7 

54 

67 

53 

0.63 

0.28 

0.13 

0.10 

5.3 

50 

63 

77 

yes 

no 

sO 

b3 

0.65 

0.31 

0.21 

0.26 

4.4 

54 

68 

54 

0.64 

0.28 

0.13 

0.09 

5.2 

50 

63 

78 

yes 

no 

sO 

b5 

0.66 

0.31 

0.21 

0.26 

4.4 

54 

68 

54 

0.65 

0.28 

0.13 

0.09 

5.1 

50 

63 

78 

yes 

no 

sO 

b8 

0.66 

0.31 

0.21 

0.25 

4.3 

54 

68 

54 

0.66 

0.28 

0.13 

0.14 

5.0 

50 

62 

73 

yes 

yes 

si 

bO 

0.63 

0.31 

0.21 

0.25 

4.8 

54 

67 

54 

0.66 

0.26 

0.13 

0.20 

5.3 

50 

61 

53 

yes 

yes 

si 

bl 

0.65 

0.31 

0.21 

0.25 

4.6 

54 

67 

54 

0.66 

0.26 

0.13 

0.15 

5.2 

50 

62 

58 

yes 

yes 

si 

b3 

0.67 

0.31 

0.21 

0.25 

4.5 

54 

67 

54 

0.66 

0.26 

0.13 

0.15 

5.2 

50 

62 

58 

yes 

yes 

si 

b5 

0.67 

0.31 

0.21 

0.25 

4.5 

54 

67 

54 

0.66 

0.26 

0.13 

0.15 

5.2 

50 

62 

58 

yes 

yes 

si 

b8 

0.66 

0.31 

0.21 

0.25 

4.6 

54 

67 

54 

0.67 

0.27 

0.13 

0.17 

5.2 

49 

61 

56 

yes 

no 

si 

bl 

0.64 

0.31 

0.21 

0.25 

4.6 

54 

67 

54 

0.65 

0.26 

0.13 

0.15 

5.2 

50 

62 

57 

yes 

no 

si 

b3 

0.66 

0.31 

0.21 

0.25 

4.6 

54 

67 

54 

0.65 

0.27 

0.13 

0.15 

5.2 

50 

62 

57 

yes 

no 

si 

b5 

0.66 

0.31 

0.21 

0.25 

4.5 

54 

67 

54 

0.66 

0.27 

0.13 

0.15 

5.2 

50 

62 

57 

yes 

no 

si 

b8 

0.65 

0.31 

0.21 

0.25 

4.6 

54 

67 

54 

0.66 

0.27 

0.12 

0.17 

5.2 

50 

62 

55 


974 Table 3. Skill of model and assimilation integrations versus CEOP in situ observations. 

975 Integrations shown in bold face are those discussed in the main text. Anomaly R for LST from 

976 ISCCP is 0.52, RMSE is 7.6 K. 

977 
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978 

979 Fig. 1 . Location of CEOP stations. Stations suitable for validation are indicated with circles 

980 (SH, LH, 30 stations), plus signs (GH, 20 stations), and crosses (LST, 19 stations). 
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Fig. 2. LST is described in (left) CLSM with a prognostic variable ("TSURF") and in (right) 
Noah with a diagnostic variable ("TSKIN"). LST increments are applied to "TSURF" in CLSM 
and to "TSOIL1" in Noah (section 4.b.)- 
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Fig. 3. RMSE versus CEOP in situ observations for (top) LST and (bottom) flux estimates from 
ISCCP retrievals (LST only), model integrations, and select assimilation integrations without a 
priori scaling. 
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R of LST anomalies [-] 
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Fig. 4. R versus CEOP in situ observations for LST anomalies from ISCCP retrievals, model 
integrations, and select assimilation integrations with a priori scaling. 
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Fig. 5. LST (left) annual seasonal and (right) August diurnal cycle at (top) Bondville and 
(bottom) BJ-SAWS3 for CEOP, ISCCP, model, and assimilation data. 
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Fig. 6. (Top) Mean of innovations [K] and (bottom) standard deviation of normalized 
innovations [dimensionless] for (C) Catchment and (N) Noah assimilation integrations. The box 
plots indicate the average, standard deviation, minimum and maximum of the respective 
innovations diagnostic across the stations listed in Table 1. 
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Fig. 7. (Top) LST, (upper middle) LH, (lower middle) SH, and (bottom) GH for Noah 
integrations, ISCCP retrievals, and CEOP observations at the MGS station. 
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